Comments-Oriented Document Summarization Based on Multi-aspect Co-feedback Ranking
Abstract
With the popularity of Web 2.0, comments left by readers on web documents have drawn much attention. In this paper, we study the problem of comments-oriented document summarization, which aims to summarize a web document by considering not only its content but also its comments. Most comments convey one or a few aspects of the document. Given a set of sentences drawn from both the web document and its corresponding comments, we can divide the sentences into clusters (named “aspects”) according to their content. Summarizing the web document based on these clusters is both challenging and interesting. Motivated by this, we propose a novel model, MultiAspectCoRank, for comments-oriented document summarization. We first rank all the sentences based on the multiple aspects obtained from the whole document, and then provide each ranking list as feedback to the others until the top-N results of each ranking list are unchanged. We obtain the final result by integrating these ranking lists. Experimental results on a set of real-world blog data with manually labeled sentences show the promising performance of our...
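The ranking-with-feedback loop described in the abstract can be illustrated with a small sketch. This is not the MultiAspectCoRank model itself, whose update rule the abstract does not give. The function `co_feedback_rank`, the mixing weight `alpha`, the similarity matrix `sim`, and the linear feedback update are all assumptions made for illustration. Only the overall shape follows the abstract: one ranking list per aspect, mutual feedback until each top-N is unchanged, then integration of the lists.

```python
from typing import Dict, List, Tuple


def top_n(scores: List[float], n: int) -> List[int]:
    """Indices of the n highest-scoring sentences (ties broken by index)."""
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:n]


def normalize(scores: List[float]) -> List[float]:
    """Scale scores to sum to 1; leave an all-zero list unchanged."""
    total = sum(scores)
    return [s / total for s in scores] if total > 0 else list(scores)


def co_feedback_rank(
    sim: List[List[float]],
    aspects: List[int],
    n: int = 2,
    alpha: float = 0.3,   # assumed feedback weight, not from the paper
    max_iter: int = 50,
) -> Tuple[List[int], List[float]]:
    """Hypothetical co-feedback ranking over aspect-specific lists."""
    labels = sorted(set(aspects))
    size = len(sim)
    # Base score per aspect: similarity of each sentence to that aspect's members.
    base: Dict[int, List[float]] = {
        k: normalize([
            sum(sim[i][j] for j, a in enumerate(aspects) if a == k and j != i)
            for i in range(size)
        ])
        for k in labels
    }
    ranks = {k: list(v) for k, v in base.items()}
    for _ in range(max_iter):
        prev_top = {k: top_n(ranks[k], n) for k in labels}
        new_ranks = {}
        for k in labels:
            others = [ranks[o] for o in labels if o != k]
            # Feedback: average of the other aspects' current scores.
            feedback = (
                [sum(col) / len(others) for col in zip(*others)]
                if others else [0.0] * size
            )
            new_ranks[k] = normalize([
                (1 - alpha) * b + alpha * f for b, f in zip(base[k], feedback)
            ])
        ranks = new_ranks
        # Stop once the top-N of every ranking list is unchanged.
        if all(top_n(ranks[k], n) == prev_top[k] for k in labels):
            break
    # Integrate the lists by summing the per-aspect scores.
    combined = [sum(ranks[k][i] for k in labels) for i in range(size)]
    return top_n(combined, n), combined


# Toy data: four sentences, two aspects (values are made up).
sim = [
    [1.0, 0.8, 0.1, 0.2],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.7],
    [0.2, 0.1, 0.7, 1.0],
]
selected, combined = co_feedback_rank(sim, aspects=[0, 0, 1, 1], n=2)
```

Summing the lists is only one possible way to integrate them; a real system would more likely pick sentences per aspect so that the summary covers every cluster.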
Similar Resources
Generating Aspect-oriented Multi-Document Summarization with Event-aspect model
In this paper, we propose a novel approach to the automatic generation of aspect-oriented summaries from multiple documents. We first develop an event-aspect LDA model to cluster sentences into aspects. We then use an extended LexRank algorithm to rank the sentences in each cluster, and Integer Linear Programming for sentence selection. Key features of our method include automatic grouping of seman...
Generic Multi-Document Summarization Using Topic-Oriented Information
Graph-based ranking models have recently been widely used for multi-document summarization. By utilizing the correlations between sentences, the salient sentences can be extracted according to their ranking scores. However, traditional methods treat sentences uniformly, without considering topic-level information. This paper proposes the topic-oriented PageRank (ToPageRank)...
Decayed DivRank for Guided Summarization
Guided summarization is essentially a form of aspect-based multi-document summarization, where aspects can be taken as specified queries. We proposed a novel ranking algorithm, Decayed DivRank (DDRank), for the guided summarization tasks of TAC 2011. DDRank can address relevance, importance, diversity, and novelty simultaneously through a decayed vertex-reinforced random walk process in sen...
iDVS: An Interactive Multi-document Visual Summarization System
Multi-document summarization is a fundamental tool for understanding documents. Given a collection of documents, most existing multi-document summarization methods automatically generate a static summary for all users, using unsupervised learning techniques such as sentence ranking and clustering. However, these methods almost entirely exclude humans from the summarization process. They do not allow...
Reader-Aware Multi-Document Summarization: An Enhanced Model and The First Dataset
We investigate the problem of reader-aware multi-document summarization (RA-MDS) and introduce a new dataset for this problem. To tackle RA-MDS, we extend a variational auto-encoder (VAE) based MDS framework by jointly considering news documents and reader comments. To evaluate summarization performance, we prepare a new dataset. We describe the methods for data collection, aspect...